Top-k-FCI: Mining Top-K Frequent Closed Itemsets in Data Streams

Authors

  • Jun LI
  • Sen GONG
  • J. Li
Abstract

With the generation and analysis of stream data, such as real-time network monitoring, log records, and click streams, data stream mining has attracted a great deal of attention in the field of data mining. When mining data streams, it is more reasonable to ask users to set a bound on the result size. Therefore, in this paper, a real-time single-pass algorithm, called Top-k-FCI (top-K frequent closed itemsets of data streams), is proposed for efficiently mining top-K closed itemsets from data streams. A novel algorithm, called can(T) (candidate itemset of T), is developed for mining the essential candidates of the closed itemsets generated so far. Experimental results show that the proposed Top-k-FCI algorithm is an efficient method for mining top-K frequent itemsets from data streams.
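
To make the target of the abstract concrete, the following is a minimal brute-force sketch of what "top-K frequent closed itemsets" means on a small, fixed batch of transactions. It is not the Top-k-FCI or can(T) algorithm, which the abstract does not specify; it is neither single-pass nor streaming. The function name `top_k_closed_itemsets`, the sample transactions, and the tie-breaking order are assumptions made for illustration. An itemset is treated as closed when no proper superset has the same support.

```python
from collections import Counter
from itertools import combinations


def top_k_closed_itemsets(transactions, k):
    """Brute-force reference (hypothetical helper, not the paper's method):
    return the k closed itemsets with the highest support."""
    # Count the support of every non-empty subset of every transaction.
    support = Counter()
    for t in transactions:
        items = sorted(set(t))
        for r in range(1, len(items) + 1):
            for combo in combinations(items, r):
                support[frozenset(combo)] += 1

    # Keep an itemset only if no proper superset has the same support.
    closed = [
        (itemset, count)
        for itemset, count in support.items()
        if not any(itemset < other and count == support[other] for other in support)
    ]

    # Highest support first; ties broken by size, then content (an assumption).
    closed.sort(key=lambda pair: (-pair[1], len(pair[0]), sorted(pair[0])))
    return closed[:k]


# Illustrative data only; not taken from the paper.
transactions = [
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "b", "d"],
    ["c"],
    ["a", "b", "c"],
]
result = top_k_closed_itemsets(transactions, 3)
for itemset, count in result:
    print(sorted(itemset), count)
```

In this sample, {a} and {b} each occur four times but always together, so only {a, b} is closed at support 4. Enumerating every subset is exponential in transaction length, which is why a streaming method has to maintain a compact candidate set instead.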

Similar resources

Mining Top-k Frequent Closed Itemsets in Data Streams Using Sliding Window

Frequent itemset mining has become a popular research area in the data mining community over the last few years. There are two main technical hitches in finding frequent itemsets. First, the user must provide an appropriate minimum support value to start with and then tune it by running the algorithm again and again. Second, the generated frequent itemsets are mostly numerous and ...

Efficient Incremental Mining of Top-K Frequent Closed Itemsets

In this work we study the mining of top-K frequent closed itemsets, a recently proposed variant of the classical problem of mining frequent closed itemsets where the support threshold is chosen as the maximum value sufficient to guarantee that the itemsets returned in output be at least K. We discuss the effectiveness of parameter K in controlling the output size and develop an efficient algori...

GC-Tree: A Fast Online Algorithm for Mining Frequent Closed Itemsets

Frequent closed itemsets are a complete and condensed representation of all the frequent itemsets, and they are important for generating non-redundant association rules. They have been studied extensively in data mining research, but most of that work is based on a traditional transaction database environment and thus has performance issues in a data stream environment. In this paper, a novel approach is ...

The Top-k Frequent Closed Itemset Mining Using Top-k SAT Problem

In this paper, we introduce a new problem, called Top-k SAT, that consists in enumerating the Top-k models of a propositional formula. A Top-k model is defined as a model with less than k models preferred to it with respect to a preference relation. We show that Top-k SAT generalizes two well-known problems: the partial Max-SAT problem and the problem of computing minimal models. Moreover, we p...

The Algorithm of Mining Frequent Closed Itemsets Based on Index Array

The set of frequent closed itemsets determines exactly the complete set of all frequent itemsets and is usually much smaller than the latter. In this paper, an algorithm based on an index array for mining frequent closed itemsets, Index-FCI, is proposed. The vertical BitTable is adopted to compress the dataset so that support can be counted quickly. To make use of the horizontal BitTable, the index array co...

Publication date: 2011